Data collection of Japanese dialects and its influence into speech recognition

Authors

  • Ikuo Kudo
  • Takao Nakama
  • Tomoko Watanabe
  • Reiko Kameyama
Abstract

This paper reports the successful completion of the Japanese POLYPHONE project, the Voice Across Japan (VAJ) data collection project. The database has the following characteristics: 1) a large number of speakers (8,866) recorded over telephone lines, 2) personal information collected from each participant, such as gender, age, and the place where they grew up, and 3) data segmented at phone or word boundaries. The paper describes several aspects of Japanese dialects and reports experiments measuring how much dialects influence speech recognition. In our results, dialects affect the speech recognition rate by 2-4%. These results are useful for building practical speech recognition systems as well as for data collection.

Similar resources

Parallel Speech Corpora of Japanese Dialects

Clean speech data is necessary for spoken language processing, however, there is no public Japanese dialect corpus collected for speech processing. Parallel speech corpora of dialect are also important because real dialect affects each other, however, the existing data only includes noisy speech data of dialects and their translation in common language. In this paper, we collected parallel spee...

The Short Vowels /i/ and /u/ in Iranian Balochi Dialects

The aim of the present paper is to study the status of the short vowels /i/ and /u/ in five selected Iranian Balochi dialects. These dialects are spoken in Sistan (SI), Saravan (SA), Khash (KH), Iranshahr (IR), and Chabahar (CH) regions located in province Sistan va Baluchestan in the southeast of Iran. This study investigates whether these two vowels have the same qualities as the short /i/ an...

Statistical Method of Building Dialect Language Models for ASR Systems

This paper develops a new statistical method of building language models (LMs) of Japanese dialects for automatic speech recognition (ASR). One possible application is to recognize a variety of utterances in our daily lives. The most crucial problem in training language models for dialects is the shortage of linguistic corpora in dialects. Our solution is to transform linguistic corpora into di...

Towards a Localised German Automatic Speech Recognition

Spoken languages are often rich in regional accents and dialects. These local variations often pose challenges to automatic speech recognition. In this study, we analyse the influence of German regional accents on the performance of a large vocabulary continuous speech recogniser trained on standard German data. The experiments show a large variation in the error rate over different regions. We...

A Database for Automatic Persian Speech Emotion Recognition: Collection, Processing and Evaluation

Recent developments in robotics automation have motivated researchers to improve the efficiency of interactive systems by making a natural man-machine interaction. Since speech is the most popular method of communication, recognizing human emotions from speech signal becomes a challenging research topic known as Speech Emotion Recognition (SER). In this study, we propose a Persian em...

Publication date: 1996